Abstract: Formal methods provide remarkable tools allowing
for high levels of confidence in the correctness of developments. 
Their use is therefore encouraged, when not required, for the 
development of systems in which safety or security is mandatory. 
But effectively specifying a secure system or deriving a secure 
implementation can be tricky. We propose a review of some 
classical 'gotchas' and other possible sources of concerns with 
the objective to improve the confidence in formal developments, 
or at least to better assess the actual confidence level. 

I. Introduction 

Formal methods applied to the development of systems or 
software are very efficient tools that allow for high levels of 
assurance in the validity of the results. By defining languages 
with clear semantics and by making explicit how to reason 
on these languages, they provide a mathematical framework 
in which it is possible to ensure the correctness of imple- 
mentations. Formal guarantees are often unreachable by more 
classical approaches; for example they are exhaustive whereas 
tests cover only a part of the possible executions. 

For these reasons, the use of formal methods is encouraged, 
when not required, by standards for the development of 
systems in which safety is mandatory, e.g. IEC 61508 [1]. The
situation is similar for the development of secure systems: for
the highest levels of assurance the Common Criteria (CC, [2])
require the use of formal methods to improve confidence in 
the development, as well as to ease the independent evaluation 
process. Indeed, the verification that the delivered product 
complies with its specification is expected to rely, at least to 
some extent, on a mechanically checked proof of correctness. 

One should not, however, confuse safety with security. They
overlap, but neither includes the other. Safety mostly
aims at limiting consequences of random events (dealing 
with probabilities) and security at managing malicious actions 
(dealing with the difficulty of an attack). In this paper, we 
discuss a few concerns more specifically related to the formal 
development of secure systems. These concerns are illustrated 
through simple examples (sometimes involving a malicious 
developer) in Coq [3] or in B [4] but most of them are relevant 
for other deductive formal methods such as FoCal [5], PVS 
[6], Isabelle/HOL [7], etc. 
II. Formal methods

Standard development processes identify several phases 
such as specification, design, implementation and verification 
operations. Different languages can be used for different 
phases; beyond programming languages it is frequent to use 
natural language, automata, graphical languages, UML, etc. 
The problem of the correctness of a development can then 
be seen as a problem of traceability between the various 
descriptions of the system produced at different phases. 

Formal methods considered in this paper also allow for 
multiple descriptions of a system; they differ from standard 
approaches by enforcing the use of languages with explicit 
and clear semantics, and by providing a logical framework 
to reason on them. Ensuring the correctness then becomes a 
mathematical analysis of the traceability (or consistency). 

A. About formal specification 

At least two descriptions of a system are generally con- 
sidered in formal methods, a formal specification and an 
implementation. The specification is often written in a logical 
language (e.g. based on predicates) and is ideally declarative, 
abstract, high-level and possibly non-deterministic, describing 
the what. On the other hand, the implementation is imperative, 
concrete, low-level, and deterministic, describing the how. To 
emphasise the difference between declarative and imperative 
approaches, consider the specification of the integer square
root function, $\sqrt{n}^2 \le n < (\sqrt{n}+1)^2$, which is
deterministic (for any n there is at most one acceptable value
for $\sqrt{n}$) but is not a program: how it is computed is left
to the developer.
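
The gap between the declarative specification and an operational
program can be sketched in Python. This is only an illustration:
the names is_isqrt, isqrt_linear and witnesses are hypothetical and
do not come from the original development. The predicate transcribes
the specification, and the search loop is one of many programs that
satisfy it.

```python
def is_isqrt(n: int, r: int) -> bool:
    """Specification: r is the integer square root of n (the 'what')."""
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)


def isqrt_linear(n: int) -> int:
    """One possible implementation (the 'how'): a naive upward search."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def witnesses(n: int) -> list[int]:
    """All candidates in 0..n that satisfy the specification for n."""
    return [r for r in range(n + 1) if is_isqrt(n, r)]
```

Checking is_isqrt(n, isqrt_linear(n)) on a range of values is a test,
not a proof; the point is only that the specification constrains the
result without saying how to compute it.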

Simply writing a formal specification is already an im- 
provement compared to standard approaches. Indeed, by using 
a formal language, ambiguities are resolved. Furthermore, 
formal methods provide ways to check, at least partially, the 
consistency of the specification. 

B. About refinement 

The process of going from a specification to an implemen- 
tation, while checking the compliance, is called refinement. 
This concept captures the activity of designing a system; 
it encompasses a lot of subtle activities, including choosing 
concrete representations for abstract data or producing op- 
erational algorithms matching declarative descriptions. From 
a logical point of view, a formal specification describes a
family of models (that is, intuitively, implementations) and the
refinement process consists of choosing one of those models. 

Any formal method defines, implicitly or explicitly, a form 
of refinement. The definition generally ensures that the re- 
finement is transitive (allowing for an arbitrary number of 
refinement steps in the development process) and monotone 
(allowing for the decomposition of a problem into several sub- 
problems that will be refined independently). 

Formal methods do not automatically produce refinements 
but explain how to check that a refinement is valid, that is they 
ensure that very different objects (a logical description and an 
operational implementation) are sufficiently 'similar'. To allow 
for this comparison the refinement is by nature extensional: the 
objects are seen from a functional point of view, as black boxes
of which only the inputs and outputs are relevant. If the
specification is to sort values, any sorting algorithm is valid and
any two algorithms are considered equal (indistinguishable). Note that
the word intensional is often used to refer to properties that 
are not extensional; for example, execution time or memory 
use for a sorting algorithm can be considered as intensional. 
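
As a small illustration of the extensional view, the Python sketch
below compares two textbook sorting algorithms (chosen here for
illustration, not taken from the paper). Their code and running times
differ, which are intensional properties, yet no observation of inputs
and outputs tells them apart.

```python
def insertion_sort(xs):
    """Quadratic-time sort: insert each element into a sorted prefix."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def merge_sort(xs):
    """O(n log n) sort: split, sort both halves, merge."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def extensionally_equal(f, g, inputs):
    """True if f and g return the same output on every given input."""
    return all(f(x) == g(x) for x in inputs)
```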

C. About logic 

Behind any formal method there is logic or, more accurately,
a logic. It is not the intent of this paper to discuss the various
types of logic at length; we only point out the well-known
fact that a specification can be inconsistent.

A specification is inconsistent if it is self-contradictory; a
trivial example is to specify $v$ as a natural value equal to both
0 and 1. Such a specification is also said to be unsatisfiable,
that is it does not admit a model in the logical sense. There 
are three important points about inconsistent specifications: 

• the detection of inconsistency cannot be automated in the
general case (the problem of satisfiability is undecidable);

• an inconsistent specification cannot be implemented;

• an inconsistent specification can prove any property.

Tools implementing the formal methods considered in this paper
do not even try to detect inconsistencies (not even the trivial
example of $v = 0 = 1$), both because of this undecidability and
because, the aim of a formal development being an implementation,
any inconsistency is detected sooner or later. The last
point results from the fact that for any proposition $P$ we have
$\mathit{False} \Rightarrow P$ (using false assumptions one can prove anything).
The consequences of these points are discussed in this paper.

III. A short presentation of B and Coq

A. About B

The B Method [4] is a formal method widely used by both 
the academic world and the industry. Beyond the well-known 
examples of developments of safety systems (e.g. [8]), it is 
also recognised for security developments. 

B defines a first-order predicate logic completed with el- 
ements of set theory, the Generalised Substitution Language 
(GSL) and a methodology of development in which the notion 
of refinement is explicit and central. The logic is used to 
express preconditions, invariants and to conduct proofs. The 
GSL allows for definitions of substitutions that can be abstract,
declarative and non-deterministic (that is, specifications) as
well as concrete, imperative and deterministic (that is,
programs). The following example uses the non-deterministic
substitution ANY (a 'magic' operator finding a value which
satisfies a property) to specify the square root of a natural
number n:

ANY x WHERE x² ≤ n < (x + 1)² THEN √(n) := x

The notion of refinement is expressed between machines 
(modules combining a state defined by variables, properties 
such as an invariant on the state and operations encoded 
as substitutions to read or alter the state) and captures the 
essence of program correctness w.r.t. their specification as 
follows: an implementation refines a specification if the user 
cannot exhibit a behaviour of the implementation that is not 
compliant with what is required by the specification. This
concept is incorporated into the methodology by the automated 
generation of proof obligations at each refinement step, and is 
sustained by mathematical justifications not detailed here. 

One of the characteristics of the refinement in B is that 
it is independent of the internal representation used by the 
machines, as illustrated by the following example of a system 
returning the maximum of a set of stored natural values: 

MACHINE Ma
VARIABLES S
INVARIANT S ⊆ N
INITIALISATION S := ∅
OPERATIONS
  store(n) = PRE n ∈ N THEN S := S ∪ {n}
  m ← get = PRE S ≠ ∅ THEN m := max(S)
END

MACHINE Mc REFINES Ma
VARIABLES s
INVARIANT s = max(S ∪ {0})
INITIALISATION s := 0
OPERATIONS
  store(n) = IF s < n THEN s := n
  m ← get = m := s
END

The state of the machines is described in the VARIABLES 
clause; for the specification Ma it is a set of natural num- 
bers and for the implementation Mc a natural number. The 
INVARIANT clause defines a constraint over the state; 
for Ma it indicates that S is a subset of N, whereas for 
Mc it describes the glue between the states of Ma and Mc 
(intuitively claiming that if both machines are used in parallel 
then s is always equal to max(S)). The INITIALISATION 
clause sets the initial state, while the OPERATIONS clause 
details the operations used to read or alter the state. The two 
machines differ yet Mc refines Ma: roughly speaking one 
cannot exhibit a property of Mc which contradicts one of Ma. 
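
The relation between Ma and Mc can be simulated in Python. The sketch
below is an assumption-laden illustration rather than a B proof: the
classes mirror the two machines, glue checks the gluing invariant
s = max(S ∪ {0}), and simulate (a hypothetical helper) runs both
machines side by side on the same random inputs.

```python
import random


class Ma:
    """Abstract machine: keeps the whole set S."""
    def __init__(self):
        self.S = set()

    def store(self, n):
        assert n >= 0            # PRE n ∈ N
        self.S.add(n)

    def get(self):
        assert self.S            # PRE S ≠ ∅
        return max(self.S)


class Mc:
    """Concrete machine: keeps only the running maximum s."""
    def __init__(self):
        self.s = 0

    def store(self, n):
        if self.s < n:
            self.s = n

    def get(self):
        return self.s


def glue(ma, mc):
    """Gluing invariant between the two states."""
    return mc.s == max(ma.S | {0})


def simulate(steps, seed=0):
    """Run both machines in parallel; report whether they ever disagree."""
    rng = random.Random(seed)
    ma, mc = Ma(), Mc()
    ok = glue(ma, mc)
    for _ in range(steps):
        n = rng.randrange(1000)
        ma.store(n)
        mc.store(n)
        ok = ok and glue(ma, mc) and ma.get() == mc.get()
    return ok
```

Such a simulation only samples behaviours; in B the same property is
established for all executions by discharging the refinement proof
obligations.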

Note the use of the PRE substitution defining a precondition,
that is a condition that the user has to check before calling
an operation. This is an offensive approach; an operation
should not, but can, be used when this condition is not
satisfied, yet in such a case there is no guarantee about the
result (it may even cause a crash). By contrast, the defensive
approach is represented in B by using guards (that is, an IF)
that prevent unauthorised uses. These notions are standard in
formal methods and are discussed further later in this paper.

B. About Coq 

Coq is a proof assistant based on a type theory. It offers a 
higher-order logical framework that allows for the construction 
and verification of proofs, as well as the development and 
analysis of functional programs in an ML-like language with 
pattern-matching.

Coq implements the Calculus of Inductive Constructions [9] 
and it is frequent in developments to use inductive definitions. 
For example, N is defined in the Peano style as follows: 

Inductive N := 0 : N | S : N → N.

This definition means that N is the smallest set of terms closed
under (a finite number of) applications of the constructors 0 and
S. N is thus made of the terms 0 and $S^n(0)$ for any finite n;
being well-founded, structural induction on N is possible (the
induction principle is automatically derived by Coq after the
definition of N). The definition also means that N contains
no other values (surjectivity) and that $\forall (n : N),\ 0 \neq S(n)$ and
$\forall (m\ n : N),\ S(m) = S(n) \Rightarrow m = n$ (injectivity).

Contrary to B, there is no enforced development methodology
in Coq, nor any explicit refinement process. The user can
choose between several styles of specification and
implementation, and has to decide on their own which properties
are to be checked. For example, the weak specification style
consists of defining functions as programs in the internal
ML-like language and later checking properties of these
functions, as illustrated here by the division by 2:

Fixpoint div2 (x : N) : N :=
  match x with S(S(x')) ⇒ S(div2(x')) | _ ⇒ 0 end.

Theorem div2_def :
  forall (n : N), n = 2 * div2(n) ∨ n = 2 * div2(n) + 1.
Proof.
  ...
Qed.

div2 is a recursive program (using div2(x+2) = div2(x)+1) and
div2_def a property claimed about it; the proof, not detailed
here, ensures that div2 indeed satisfies div2_def.
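
The weak specification style can be mimicked in Python, with the
caveat that a test over a finite range replaces the Coq proof. In this
sketch the function div2 follows the recursive definition above and
div2_def (here a boolean check, an assumption of this illustration)
states the claimed property.

```python
def div2(x: int) -> int:
    """Recursive program: div2(x + 2) = div2(x) + 1, and 0 otherwise."""
    if x >= 2:
        return 1 + div2(x - 2)
    return 0


def div2_def(n: int) -> bool:
    """Property claimed about div2, stated separately from the program."""
    return n == 2 * div2(n) or n == 2 * div2(n) + 1
```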

IV. Specifying secure systems 

We now begin our discussion about developing secure sys- 
tems using formal methods by considering more specifically 
formal specifications of secure systems. 

To start with trivial considerations, we first have to note that 
formal methods offer tools to express specifications but that 
there is no way to force a developer to describe the properties 
required of the system under development. Clearly, using even 
the most efficient formal method without adopting the 'formal 
spirit' is meaningless, as there is no benefit compared to
standard approaches if the formal specification is empty. Note also
that a formal development is a development, and so can also
benefit from standard practices such as naming conventions,
modularity, documentation, etc. In the case of formal methods, 
in fact, the very process of deriving a formal specification from 
the book of specifications should be documented, justifying 
the formalisation choices and identifying, if any, aspects of 
the system left out (as it is generally not reasonable or even 
feasible to aim at a full formalisation of a complete system). 

Assuming a developer that has adopted the formal spirit, 
there are further points to care about in order to develop an 
'adequate' formal specification for a secure system, that is a 
specification not only expressing the required properties, but 
also ensuring that those properties are enforced at all stages 
of the development as well as in any (reasonable) scenario of 
usage of the implementation. 

Some of the concerns that will be discussed below are 
applicable for safety or any high assurance system; for others 
a malicious developer will be assumed (a threat generally 
irrelevant for safety but applicable in security). The ultimate 
objective of such a malicious developer is to exploit any 
weakness of a specification, in order to trap a system while 
delivering a mechanically checked proof of compliance. One 
could consider that such traps would be detected through code 
review or testing. Yet, beyond the fact that formal methods are 
expected to reduce the need for such activities, we warn the
reader that our illustrations are deliberately simplistic, and that
real-life examples of Trojan horses are difficult to detect.



A. About invalid specifications 

As pointed out in Par. II-C, inconsistent specifications are
disastrous. Indeed, not only can inconsistency not be detected
automatically, it also makes it possible to discharge any proof
obligation; that is, an inconsistent specification can in practice
make the developer's life more comfortable. An inconsistent
specification is therefore dangerous for safety developments if
a distracted developer fails to notice that their proofs are a little
too easy to produce, and more so for security developments, as
a malicious developer identifying such a flaw would be able
to prove whatever they want.

Of course, an inconsistent specification is not implementable.
It is therefore possible to check consistency by
providing an implementation; any one will do, so
even a dummy implementation is sufficient. Yet in security
there are situations in which a formal specification is mandatory
while a formal implementation is not. This is the case for
the CC, which at some assurance levels require only a formal
specification of the Security Policy. An undetected inconsistent
specification is therefore a possibility.

In B the consistency of a specification is partially checked
through proof obligations to be discharged by the developer.
Yet the obligations related to the existence of values
satisfying the expressed constraints for parameters, variables and
constants are deferred. Both of the following specifications are
inconsistent, yet all explicit proof obligations can be discharged
(that is, most B tools will report a '100% proven' status):

MACHINE absurd_var
VARIABLES v
INVARIANT
  v ∈ N ∧
  v = 0 ∧ v = 1
ASSERTION 0 = 1

MACHINE absurd_cst
CONSTANTS f
PROPERTIES
  ∀ x, y, x < y ⇒ f(x) > f(y)
ASSERTION 0 = 1


Of course, delaying such proof obligations is justified, as 
implementing the specification will force the developer to 
exhibit a witness for v that meets the specification (a con- 
structive proof that the specification is satisfiable). Therefore, 
B ensures that any inconsistency is detected, at the latest, 
at the implementation stage. But we would like to remind 
the reader that a formally derived implementation is not 
always required. In such a case, one should consider additional 
manual verifications to check the existence of valid values for 
parameters, constants and variables. 

Inconsistencies can be rather easy to introduce, accidentally
or not, by contradicting implicit hypotheses associated with the
formal method in use. In B, for example, there is a clause SETS
that allows for the declaration of abstract sets used in a
machine; one can easily forget that in B such a set is always
finite and non-empty. If the developer contradicts one of these
implicit hypotheses the specification becomes inconsistent
without any warning by the tool; in fact the automated prover
will very efficiently detect the contradiction and use it as a
lemma to discharge any proof obligation. Contradiction of implicit
principles of the underlying logic can also be illustrated in
Coq with two very simple examples. The first one is a naive
attempt to specify Z using N:

Inductive Z : Set := plus : N → Z | minus : N → Z.
Hypothesis zero_unsigned : plus(0) = minus(0).

Unfortunately, as pointed out in Par. III-B, the definition of Z
is not a specification but an implementation (Z is the set of all
terms of the form plus(n) or minus(n)). zero_unsigned introduces
an inconsistency because it contradicts the injectivity principle
for the constructors: for any natural values n and m it is
possible to prove in Coq that $plus(n) \neq minus(m)$.

The second example is related to the unexpected consequences
of using possibly empty types. This is illustrated by
the following (failed) attempt to define bi-colored lists of
natural values, that is lists with each element marked red or
blue:

Inductive blst : Set := red : blst → N → blst
                      | blue : blst → N → blst.

In the absence of an atomic constructor for the empty list,
blst, which is the smallest set of terms stable under application
of the constructors, is indeed empty. Therefore, assuming the
existence of such a list is inconsistent, and any theorem of the
form $\forall (b : blst),\ P$ is provable; this is hardly a problem from
the developer's point of view, as they generally try to prove only
those properties they expect. It would be prudent, for any type
T introduced in Coq, to ensure that it is not empty, e.g. by
proving a theorem of the form $\exists (t : T),\ True$.



One could also investigate the satisfiability of the preconditions
or guards, as defined in Par. III-A, associated with functions
or operations. Indeed, while unsatisfiable preconditions are not
inconsistent, they often represent a form of deadlock, as they
mean that it is never possible to use an operation. They may
however be difficult to detect; a famous example is
the database of individuals developed in [4], in which it is
impossible to insert new entries, as pointed out in [10], because
any new individual introduced in the database
should have a father and a mother, while the initial state is an
empty database. To avoid such difficulties the use of adequate
tools (animation of models, model checking, automatic test
generators, cf. [11]-[15]) can be of considerable help.
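
The kind of deadlock described above is what state exploration tools
look for. The Python sketch below is a deliberately simplified model
(an assumption of this illustration, not the database of [4]): an
individual p can be inserted only if two distinct parents are already
present, and reachable_states enumerates every state reachable from
the initial one over a small hypothetical universe.

```python
from itertools import permutations

UNIVERSE = ("a", "b", "c", "d")   # hypothetical finite pool of individuals


def enabled_inserts(db):
    """Calls insert(p, father, mother) whose precondition holds in db."""
    return [(p, f, m)
            for p in UNIVERSE if p not in db
            for f, m in permutations(db, 2)]


def reachable_states(initial=frozenset()):
    """Explore all states reachable from `initial` through enabled calls."""
    seen, todo = {initial}, [initial]
    while todo:
        db = todo.pop()
        for p, _father, _mother in enabled_inserts(db):
            nxt = db | {p}
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen
```

Starting from the empty database no insertion is ever enabled, so the
exploration returns the initial state alone; seeding the database with
two individuals unlocks the operation.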

We would also like to draw the attention of readers
to other types of problematic specifications. For example,
it may happen that a specification mixes predicates
of the form $P \Rightarrow Q$ and $P \Rightarrow \neg Q$. Such a specification is
consistent, but only as long as $P$ is false; at the very least this
type of specification should be considered inappropriate. This
is one of the cases for which specification engineering tools
are useful. Such tools associate, for example,
with a specification $\forall x,\ P \Rightarrow Q$ an additional proof obligation
$\exists x,\ P$; indeed the specification can be vacuously true if $P$ is
always false, but it is unlikely that such a specification conveys
the intended meaning [16].
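
Vacuity is easy to reproduce on a finite domain. In the Python sketch
below (all names are hypothetical and chosen for this illustration),
implication_holds checks a specification of the form "for all x, P
implies Q" and has_witness checks the additional obligation that some
x satisfies P.

```python
def implication_holds(domain, P, Q):
    """Checks 'for all x in domain, P(x) implies Q(x)'."""
    return all(Q(x) for x in domain if P(x))


def has_witness(domain, P):
    """Additional obligation: 'there exists x in domain with P(x)'."""
    return any(P(x) for x in domain)


DOMAIN = range(10)


def never(x):
    """A hypothesis that no element of DOMAIN satisfies."""
    return x < 0


def is_even(x):
    return x % 2 == 0


def is_odd(x):
    return x % 2 == 1
```

With the hypothesis never, both implication_holds(DOMAIN, never,
is_even) and implication_holds(DOMAIN, never, is_odd) succeed although
the two conclusions contradict each other; only the failing
has_witness check reveals that the specification says nothing.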

B. About (mis)understandings 

Consequences of invalid specifications have been identified
and justify establishing procedures to check consistency. We
now discuss the problem of insufficient specifications, which
is trickier to detect as it generally refers to a difference
between a specification and its intended meaning.

Our very first concern is related to the understanding of
the chosen formal method. It is not reasonable to expect all
'users' of formal methods to be experts. One may consider,
for example, a customer who is convinced of
the value of formal methods but has no
in-depth knowledge of any of them. In fact, we would
also argue that, should formal methods be more widely used
(definitely something we expect for the future), they should
be accessible to people who have received dedicated training
but who are not experts (this is one of the main objectives of
the FoCal project [5], [17]-[19]). The minimum, however, is
to ensure that any user has a basic understanding of some of
the underlying principles to avoid misinterpretation.

For example, consider the concept of refinement as introduced
in Par. II-B. The essence of this concept is to make it possible
to check that specifications and implementations are 'similar'.
This similarity should not be too strong, as a refinement
relation reduced to intensional equality of programs (that is,
the same code) would be useless. It is for example standard to
consider that computations and transient states are irrelevant.
In Coq this is reflected by the fact that equality is modulo
β-reduction (in other words, square(3) = 9 because computing
square(3) yields 9). Our concern is illustrated in B by the
following specification of an airlock system:

MACHINE Sas
VARIABLES door1, door2
INVARIANT door1, door2 ∈ {open, locked} ∧
          ¬(door1 = open ∧ door2 = open)
OPERATIONS
  open1 = IF door2 = locked THEN door1 := open
  close1 = door1 := locked
  open2 = IF door1 = locked THEN door2 := open
  close2 = door2 := locked

If the underlying principles of B are not understood,
one can easily consider that the INVARIANT clause in a
proven B machine is 'always true'. Therefore, any compliant
implementation of this specification would be considered safe.
Of course, this is not the case, as we may for example refine
the operation open1 as follows:

open1 = IF door2 = locked THEN
          door1 := open;
          IF attack THEN door2 := open; wait; door2 := locked

where wait is a passive but slow operation and attack is any
condition the malicious developer can imagine to hide
the dangerous behaviour during tests. The invariant only has to
hold before and after each operation, which it does here, yet
both doors are open while wait executes.
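
The transient violation can be made visible by recording intermediate
states. The Python sketch below is an illustration under assumptions:
the class Airlock and its trace attribute are hypothetical, and the
wait step is reduced to a comment.

```python
class Airlock:
    """Model of the malicious refinement of open1, with a state trace."""

    def __init__(self):
        self.door1 = "locked"
        self.door2 = "locked"
        self.trace = []  # every intermediate state, transient ones included

    def invariant(self):
        """The two doors are never open at the same time."""
        return not (self.door1 == "open" and self.door2 == "open")

    def _snapshot(self):
        self.trace.append((self.door1, self.door2))

    def open1(self, attack=False):
        if self.door2 == "locked":
            self.door1 = "open"
            self._snapshot()
            if attack:
                self.door2 = "open"      # transient state: both doors open
                self._snapshot()
                # the slow 'wait' operation would take place here
                self.door2 = "locked"
                self._snapshot()
```

After open1(attack=True) returns, invariant() holds, which is all the
proof obligations require, while the trace contains the forbidden
state ("open", "open").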

If stronger forms of invariant are required, e.g. to take
into account interruptions, specific modelling choices or
dedicated techniques are to be used (cf. [20]).

C. About partial specifications 

Another aspect of a formal specification of a secure system
to check is totality: is the behaviour of the system specified in
every possible circumstance? It is frequent in formal methods
to define partial specifications, either to represent a form of
contract (a condition to be satisfied before having the right to
use the system) or a form of freedom left to the developer
(because the system is not planned to be used in such conditions
or because the result is irrelevant). While the first interpretation
can be considered during formal developments, the second
one becomes the only relevant one once we leave the abstract
world of formal methods to deal with implemented systems.
And the extent of the freedom given to the developer is easily
underestimated, as illustrated in the following examples.

We start with two specifications of the head function (returning
the first element of a list of natural values) in Coq, in the strong
specification style¹:

head1 (l : list N) (p : l ≠ []) : {x : N | ∃ l' : list N, l = x :: l'}.
head2 (l : list N) : {x : N | l ≠ [] ⇒ ∃ l' : list N, l = x :: l'}.

Both specifications ensure that the function, called on a
non-empty list, will return the head element. Yet the first
specification is associated with a precondition, the parameter p
being a proof that the list parameter l is not empty, making
it impossible to call head1 on an empty list as it would not
be possible to build such a proof. The second specification is
on the contrary partial, allowing head2 to be used with an empty
list but not constraining the result in such a case (except for
being a natural value).

¹ In which the return value of a function is described as satisfying a property.



The point is that these two specifications are not so different:
all the logical parts of a Coq development are eliminated at
extraction (the process that extracts proved programs). This is
not specific to Coq: by nature, logical contents in a formal
development are not computable and therefore have to be
discarded in some way before a program can be produced.
And it is easy to implement both specifications in a way that
produces the same following OCaml code, where secret is any
value the malicious developer would care to export:

let head = function [] -> secret | h :: _ -> h

We illustrate the same concern in B by the specification of a
file system manager. We define the sets USR (users), Fil ⊆ FIL
(files), CNT (contents) and RGT (access rights). Cnt associates
a content with each file, Rgt associates rights with a user and
a file, and cpt gives the number of existing files. Various
operations to create, delete or access the files are assumed to
be specified but are not detailed here, except for read:

MACHINE filesystem
SETS USR; FIL; CNT; RGT = {r, w}
CONSTANTS cnul
PROPERTIES cnul ∈ CNT
VARIABLES Fil, Cnt, Rgt, cpt
INVARIANT Fil ⊆ FIL ∧
          Cnt ∈ Fil → CNT ∧
          Rgt ⊆ (USR × Fil) × RGT ∧
          cpt = card(Fil)
INITIALISATION Fil := ∅ || Cnt := ∅ || Rgt := ∅ || cpt := 0
OPERATIONS
  out ← read(f, u) =
    PRE f ∈ Fil ∧ u ∈ USR THEN
      IF ((u ↦ f) ↦ r) ∈ Rgt THEN out := Cnt(f)
      ELSE out := cnul

read is specified as returning the content of a file f, provided
that the user u has the right to read it. Yet it is only partially
specified, as we do not describe what happens when the file
does not exist. Any call of read implemented in B would be
associated with a proof obligation to ensure that the precondition
is met, but this constraint extends only as far as the use of B
itself. So let us assume that the following malicious refinement
of read is called on a non-existing file:

out ← read(f, u) =
  IF f ∈ Fil THEN
    IF ((u ↦ f) ↦ r) ∈ Rgt THEN out := Cnt(f)
    ELSE out := cnul
  ELSE Fil := Fil ∪ {fs} ||
       Cnt := Cnt ∪ {fs ↦ S} ||
       Rgt := Rgt ∪ {(eni ↦ fs) ↦ r}

Whereas the specification of read was apparently passive (not
modifying the state), this refinement creates a file fs storing
a (confidential) value S, a file accessible only by an arbitrary
user eni invented by the developer. Furthermore, the invariant
is broken, as fs is created yet not accounted for in cpt; that is,
fs is virtually invisible to the system. Note also that defining
the returned value when the file does not exist is not even
required by B; a malicious developer may however prefer to
return cnul for a better obfuscation of their code.
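
The behaviour of this refinement can be replayed in Python. The sketch
below is an illustration under assumptions: the class FileSystem, the
helper create and the concrete values of CNUL, FS, S and ENI are
hypothetical stand-ins for the B machine, its unspecified creation
operation and the constants cnul, fs, S and eni.

```python
CNUL = "<null>"                           # stands for the constant cnul
FS, S, ENI = "fs", "top-secret", "eni"    # hidden file, secret value, invented user


class FileSystem:
    def __init__(self):
        self.Fil, self.Cnt, self.Rgt, self.cpt = set(), {}, set(), 0

    def create(self, f, content, owner):
        """Hypothetical helper: the paper does not detail file creation."""
        if f in self.Fil:
            return
        self.Fil.add(f)
        self.Cnt[f] = content
        self.Rgt.add(((owner, f), "r"))
        self.cpt += 1

    def invariant(self):
        """Cnt is defined exactly on Fil, and cpt counts the files."""
        return set(self.Cnt) == self.Fil and self.cpt == len(self.Fil)

    def read(self, f, u):
        """Malicious refinement of read."""
        if f in self.Fil:                 # precondition holds: behave as specified
            return self.Cnt[f] if ((u, f), "r") in self.Rgt else CNUL
        # Precondition violated: the specification constrains nothing here.
        self.Fil.add(FS)
        self.Cnt[FS] = S
        self.Rgt.add(((ENI, FS), "r"))    # cpt is deliberately left untouched
        return CNUL
```

As long as callers respect the precondition the model behaves as
specified; a single call on a missing file silently creates the hidden
file and leaves cpt out of step with Fil.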

Clearly, a partial specification cannot enforce security, and
one should favor a total (and defensive) specification. In B
this would translate into using an IF instead of a PRE.
When the condition associated with an IF substitution is not
satisfied, the ELSE branch is executed; if it is absent it
is equivalent to a skip substitution, that is, nothing is done.
By contrast, when the condition associated with
a PRE substitution is not satisfied, there is absolutely no
guarantee about the result. Note that the defensive approach
(with redundant checks) is an implementation of the defence
in depth concept.

D. About elusive properties 

For our next point, we would like to emphasise that some 
concepts often encountered in security can be difficult to 
express in a formal specification. Confidentiality is a good 
example: while a formal specification may appear to implicitly 
provide confidentiality, one should be extremely careful about 
its exact meaning, as illustrated by the following example of 
a login manager in B. 

The system state is defined by Acc ⊆ USR, the accounts,
log, which identifies the currently logged account (nouser encoding
the absence of an open session), and Pwd, which associates a
password with each account. This last piece of information is
confidential and should not be disclosed. Operations (not detailed
in this paper) allow logging in, exiting, and creating or destroying
an account, with only the log operation specified as depending
upon Pwd to represent the confidentiality of this data. The
operation accounts, detailed here, returns the existing accounts:

MACHINE login
SETS USR; PWD
CONSTANTS root, nouser
PROPERTIES root ∈ USR ∧ nouser ∈ USR \ {root}
VARIABLES Acc, log, Pwd
INVARIANT
  Acc ⊆ USR ∧ root ∈ Acc ∧ nouser ∉ Acc ∧
  log ∈ Acc ∪ {nouser} ∧ Pwd ∈ Acc → PWD
INITIALISATION
  Acc := {root} || log := nouser || Pwd :∈ {root} → PWD
OPERATIONS
  out ← accounts =
    IF log ∈ Acc THEN
      ANY s WHERE s ∈ seq(USR) ∧
                  ran(s) = Acc ∧
                  size(s) = card(Acc)
      THEN out := s
    ELSE out := ∅

Since input and output values are not refinable in B (cf. Par. III-A),
the type of the return value of accounts has to be finalised in the
specification. In our example, we have chosen to represent
the set Acc returned by accounts as a list (or sequence in the B
terminology) s of values of USR; ran(s) = Acc ensures that the
same values appear in Acc and s, and size(s) = card(Acc) that the
length of the list s is equal to the cardinality of Acc. The proposed
malicious refinement of accounts is the following one:

out ← accounts =
  IF log ∈ Acc THEN
    ANY s WHERE s ∈ seq(USR) ∧
                ran(s) = Acc ∧
                size(s) = card(Acc)
    THEN IF Pwd(root) < guess
         THEN out := sort(s)
         ELSE out := rev(sort(s))
  ELSE out := ∅

where guess is a new variable controlled by the malicious
developer. Combining calls to accounts and changes of guess, one
can quickly derive Pwd(root) through the artificial dependency
introduced in the returned value.
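
How quickly the password leaks can be estimated with a short Python
sketch. It rests on assumptions that are not in the paper: passwords
are modelled as integers in a known range, guess is exposed through a
mutable dictionary, and make_system, meets_spec and recover_password
are hypothetical names. Each call to accounts leaks one comparison, so
a bisection recovers the password in a logarithmic number of calls.

```python
def make_system(pwd_root, acc):
    """Build the malicious accounts operation and its attacker-controlled state."""
    state = {"guess": 0}

    def accounts():
        s = sorted(acc)
        # Ordering of the result depends on the confidential value.
        return s if pwd_root < state["guess"] else s[::-1]

    return accounts, state


def meets_spec(out, acc):
    """ran(s) = Acc and size(s) = card(Acc): all the specification requires."""
    return set(out) == set(acc) and len(out) == len(acc)


def recover_password(accounts, state, lo, hi):
    """Bisection, assuming lo <= password < hi and at least two accounts."""
    ascending = sorted(accounts())
    queries = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        state["guess"] = mid
        queries += 1
        if accounts() == ascending:   # password < mid
            hi = mid
        else:                         # password >= mid
            lo = mid
    return lo, queries
```

Every value returned by accounts satisfies meets_spec, so the leak is
invisible to the proof obligations; for a 16-bit password the loop
above needs at most 16 calls.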

This example illustrates a covert channel exploit [21], as 
discussed in [22]. Even if the implementation stores Pwd in 
a private memory location protected by a trusted operating 
system - a rather optimistic assumption - its confidentiality 
cannot be guaranteed without a form of control over depen- 
dencies (e.g. considering data-flow). 

It is of course possible to impose a complete (or monomorphic)
specification [23], that is a deterministic specification
enforcing the extensional behaviour of any implementation. A
complete specification would leave no freedom to the developer
and would thus ensure that there is no covert channel
to be exploited. In our example, a complete specification
would for instance require s to be sorted in ascending order.
This is however an impractical technical solution, an indirect
means of ensuring confidentiality. Furthermore, completeness is
not expressible in the B specification language (or in most
languages considered in this paper), is generally undecidable
and is not stable under refinement of the representation of the
data, e.g. refining a set by an ordered structure.

It is also possible to better control dependencies in B by 
specifying operations using constant functions. The following 
modified specification claims that the operation accounts be- 
haves like a function depending only upon the set Acc and 
returning a list of values of USR: 

CONSTANTS ..., fct
PROPERTIES ... ∧ fct ∈ ℙ(USR) → seq(USR)
OPERATIONS
  out ← accounts =
    IF log ∈ Acc THEN out := fct(Acc) ELSE out := ∅

This approach is not yet fully satisfactory as only the depen- 
dencies for the result are described (the extensional point of 
view). It is therefore still possible to affect the behaviour of 
accounts, as in this valid refinement: 

  out ← accounts =
    out := encode(Pwd(root));
    IF Pwd(root) < guess THEN wait(10) ELSE wait(20);
    IF log ∈ Acc THEN out := fct(Acc) ELSE out := ∅

In this refinement the malicious developer implements both a 
timed channel as well as a possibly observable transient state 
of the output. 

This illustration is just intended to show why, in some cases,
expressing confidentiality can be difficult. For such properties,
complementary approaches should be considered, based e.g.
on dependency calculus or non-interference [24], [25], and
associated with standard code analysis. Note that confidentiality
is often formally addressed through access control enforced 
by a form of monitor, that is according to the Orange Book a 
tamperproof, unavoidable, and 'simple enough to be trusted' 
mechanism filtering accesses (cf. recent discussions in [26]- 
[28]). Such a monitor can itself implement this type of covert 
channel attacks if it is poorly specified. Note also that the 
confidence in a system implementing a monitor relies on the 
confidence in the information used by this monitor, such as 
the source of an access request (that would require a form of 
authentication) as well as the level of protection required by 
the accessed object (a meta-information whose origin is gen- 
erally unclear, but for which effective implementations such 
as security labels protected in integrity have been proposed). 

We mention authentication and integrity to point out another 
source of rather elusive properties, that is the characterisation 
of cryptographic functions. For example, a (cryptographic) 
hash function H is such that: 

• given h it is not possible to find x s.t. H(x) = h;

• given x it is not possible to find y ≠ x s.t. H(x) = H(y);

• it is not possible to find x ≠ y s.t. H(x) = H(y).

The first property, for example, guarantees the security of
the Unix login scheme; being able to specify a hash function
(without giving any details on its implementation) by formally
describing these properties is therefore of interest for
certifying such a scheme. Yet these properties appear to be rather
difficult to express formally. A naive translation of the last
property would just say that H is injective, which is false (as
H maps an infinite set into a finite set of binary words of
fixed length) and would lead to an inconsistent specification.
Formally expressing such properties is possible, but generally
less straightforward than one may expect.
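The pigeonhole argument behind the non-injectivity of H can be made concrete with a deliberately short hash. The sketch below is an illustration only: h16 is a hypothetical toy function (SHA-256 truncated to 16 bits) chosen so that an exhaustive search terminates quickly; it is not a hash function anyone should use.

```python
import hashlib


def h16(data: bytes) -> bytes:
    """Toy hash: the first 16 bits of SHA-256 (illustration only)."""
    return hashlib.sha256(data).digest()[:2]


def find_collision():
    """Return two distinct inputs with the same h16 value."""
    seen = {}
    # 2**16 + 1 distinct inputs are mapped into at most 2**16 outputs,
    # so at least two of them must share an output.
    for i in range(2 ** 16 + 1):
        x = i.to_bytes(4, "big")
        digest = h16(x)
        if digest in seen:
            return seen[digest], x
        seen[digest] = x
    raise AssertionError("unreachable by the pigeonhole principle")
```

A specification stating that H is injective would therefore be contradicted by any implementation with a fixed output length; what is intended is that collisions are hard to find, not that they do not exist.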

E. About the refinement paradox 

Most of the examples detailed in Pars. IV-C and IV-D are
illustrations of what is often referred to as the refinement
paradox: some properties are preserved by refinement (safety
ones generally are), others are not (security ones).

Back to the discussion of Par. IV-D, the simplest
example of 'devious' refinement that we can exhibit in B is
the following one:

MACHINE Boolean
OPERATIONS out ← go = out := true [] out := false

This machine is a very simple one, having no state and defining 
a single operation go returning a boolean value. There are of 
course two straightforward refinements: 

MACHINE Boolean_True
REFINES Boolean
OPERATIONS out ← go = out := true

MACHINE Boolean_False
REFINES Boolean
OPERATIONS out ← go = out := false



Yet it also accepts other refinements, such as the following 
one: 

MACHINE Boolean_Covert_Channel
REFINES Boolean
VARIABLE dump
INVARIANT dump ∈ ℕ
INITIALISATION dump := private_key
OPERATIONS out ← go = IF dump mod 2 =
                      THEN out := true
                      ELSE out := false;
                      dump := dump / 2
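From the observer's side, successive calls to go reveal the key one bit at a time. The following Python sketch mimics this behaviour; the polarity of the test (an odd value of dump yields true) is an assumption, since the constant compared with dump mod 2 is not legible in the listing, and make_go and observe are hypothetical names.

```python
def make_go(private_key):
    """Build an operation `go` that behaves like the devious refinement."""
    state = {"dump": private_key}

    def go():
        bit = state["dump"] % 2
        state["dump"] //= 2
        return bit == 1  # assumed polarity: odd value gives True

    return go


def observe(go, nbits):
    """Rebuild the low `nbits` bits of the key from successive results of go."""
    key = 0
    for i in range(nbits):
        if go():
            key |= 1 << i
    return key
```

Each individual result is a legitimate boolean, so every call taken alone is consistent with the abstract machine; only the sequence of results carries the secret.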

One should not believe that the refinement paradox is specific
to methods that provide an explicit form of
refinement, such as B or Z for example. Our devious
refinements implicitly include a non functional refinement of
the representation of data: we accept several implementations
as representing a single abstract value of the specification.
This intuitively describes why some variables are hidden at
the specification level. From this intuition, we suggest the
following counterpart in Coq of the refinement paradox. Let's
consider the example of the specification of booleans as an
Abstract Data Type, with the equality and a boolean function:

Module Type Boolean_Function.
  Parameter B : Set.
  Parameters ⊤ ⊥ : B.
  Parameter = : B → B → Prop.
  Hypothesis refl : ∀ (b : B), b = b.
  Hypothesis sym : ∀ (b1 b2 : B), b1 = b2 → b2 = b1.
  Hypothesis tran : ∀ (b1 b2 b3 : B), b1 = b2 → b2 = b3 → b1 = b3.
  Hypothesis neq : ¬ ⊤ = ⊥.
  Hypothesis all : ∀ (b : B), b = ⊤ ∨ b = ⊥.
  Parameter fnc : B → B.
End Boolean_Function.

The straightforward refinement of this specification is of
course to implement B as 𝔹, the Coq type of booleans, and to
implement fnc as one of the four possible boolean functions
(true, false, identity or not). But a devious implementation
gives much more freedom; we can for example choose to
implement B as ℕ, even values representing ⊥ and odd values
representing ⊤:

Module Covert_Channel : Boolean_Function.
  Definition B := ℕ.
  Definition ⊥ := 0. Definition ⊤ := 1.
  Definition = (b1 b2 : B) := ((b1 + b2) mod 2 = 0).
  Definition fnc (b : B) : B := match ((b/2) mod 4) with
    | 0 ⇒ ⊥
    | 1 ⇒ ⊤
    | 2 ⇒ b

  end.
End Covert_Channel.

This implementation introduces a new dimension in the repre- 
sentation of the data, which is hidden at specification level and 
can be used by a malicious developer to store information and 
modify results: fnc now emulates any of the boolean functions. 
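The hidden dimension can be explored with a small Python model of this module. It is a sketch under assumptions: the fourth case of the match, which is not legible in the listing, is taken to be the negation and implemented as b + 1, and encode is a hypothetical helper that builds a representative of an abstract boolean carrying a chosen selector.

```python
BOT, TOP = 0, 1  # representatives of the two abstract booleans


def eq(b1, b2):
    """Abstract equality: two naturals are equal when they have the same parity."""
    return (b1 + b2) % 2 == 0


def fnc(b):
    """Select one of four boolean functions from bits hidden in the representation."""
    selector = (b // 2) % 4
    if selector == 0:
        return BOT      # constant false
    if selector == 1:
        return TOP      # constant true
    if selector == 2:
        return b        # identity
    return b + 1        # assumed fourth case: negation (parity flipped)


def encode(value, selector):
    """Representative of `value` (0 or 1) whose hidden bits hold `selector` (0..3)."""
    return 2 * selector + value
```

Every natural number is equal, in the sense of eq, to either BOT or TOP, so the axioms of the abstract data type say nothing about the selector bits.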

Note that the term 'refinement paradox' may be considered
an overstatement, given the presentation of refinement in
Par. II-B. Clearly the very concept of refinement is extensional,
whereas on the contrary confidentiality can be considered as
intensional: rather than describing what a result should be, it 
aims at constraining how a result is produced (in this case, 
without depending upon the confidential value). Similarly, if 
refinement is intended to preserve properties described in a 
specification, it does not aim at preserving properties of the 
specification itself, or any other form of meta-properties; so 
the fact that for example completeness is not preserved should 
not be a surprise. 

V. Building on Sand? 

In Par. IV-A we have shown possible consequences of
inconsistent specifications. Obviously similar or worse
consequences can result from other sources of inconsistencies,
such as a bug in the tool implementing the formal method,
or a mistake in the theory of the formal method itself. For a
malicious developer, a paradox (a flaw in the logic that can
be used to prove at the same time both P and ¬P) discovered
in a theory or in a tool can be used to prove any property
about any development, that is to implement any unpleasant
behaviour while getting a certification.

When trying to assess the level of confidence one may 
have in the result of a formal development, the question of 
the validity of the tool and of the theory should therefore be 
addressed. 

A. About the logic 

In [29] a deep embedding (cf. [30], [31]) of the B logic in
Coq is described, that is intuitively a form of B virtual machine
developed in Coq with the objective to check the validity of
the B logic. While this deep embedding has not identified any
paradox*, it has shown that the following 'theorems' from [4]
are in fact not provable using the defined logic:

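$$E_1 \mapsto F_1 = E_2 \mapsto F_2 \;\Rightarrow\; E_1 = E_2$$
$$E_1 \mapsto F_1 = E_2 \mapsto F_2 \;\Rightarrow\; F_1 = F_2$$
$$S_1 \subseteq S_2 \,\wedge\, T_1 \subseteq T_2 \;\Rightarrow\; S_1 \times T_1 \subseteq S_2 \times T_2$$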

These results are not provable because of the definition of the 
B inference rules, which are not sufficiently precise regarding 
the formal definition of what is a cartesian product. To our 
knowledge, the fact that these results were not valid in B was 
not known by the B community. Being apparently trivial, they 
were never checked and have been integrated for example in 
provers for the B logic. That means, at a fundamental level,
that these results were in fact taken as additional axioms, 
without people knowing it - an approach that could have 
created a paradox in the logic. 

Further investigations have emphasised another form of
subtle glitch that may appear in the theory of a formal method.
As pointed out in Par. II-A, formal methods allow for multiple
descriptions of a system as well as the verification of the
similarity of these descriptions. This is sometimes obtained
by defining several semantics for a single construct.

* The consistency of the B logic has not been proved either.

In B, substitutions of the GSL (used to write operations)
are defined as predicate transformers, that is a
logical semantics. On the other hand the substitutions of the
B0 sub-language are used for implementation and also
have an operational semantics. This is the case of the
WHILE P DO S INVARIANT I VARIANT V substitution,
illustrated in [4] by the extraction of the minimum of a
non-empty set of natural values:

x := 0;
WHILE x ∉ S
DO x := x + 1
INVARIANT x ∈ [0, min(S)]
VARIANT min(S) − x
END

Using the definition of the WHILE substitution as a predicate
transformer, one can indeed show that this substitution realises
(that is, transforms into a tautology) the predicate x = min(S).
In other words the substitution is proven to extract the
minimum in any case of use (provided S ≠ ∅).
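The role of the invariant and of the variant can be observed by running the loop with both checked at each iteration. This Python sketch is not the predicate-transformer proof itself, only a dynamic check on given inputs; the variant is taken to be min(S) − x, and extract_min is a hypothetical name.

```python
def extract_min(s):
    """Run the loop on a non-empty set of naturals, checking I and V as it goes."""
    assert s, "precondition: S must be non-empty"
    m = min(s)  # used only by the checks, like I and V in the proof
    x = 0
    while x not in s:
        assert 0 <= x <= m                   # invariant holds on entry to the body
        variant_before = m - x
        x = x + 1
        assert 0 <= m - x < variant_before   # variant decreases and stays natural
    assert 0 <= x <= m                       # invariant holds on exit
    return x
```

On exit the invariant gives x ≤ min(S) and the negated guard gives x ∈ S, which together force x = min(S).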

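Denoting by $\llbracket \cdot \rrbracket$ the translation producing a C program from
a B0 substitution, the operational semantics is defined by:

$$\llbracket \texttt{WHILE}\ P\ \texttt{DO}\ S\ \texttt{INVARIANT}\ I\ \texttt{VARIANT}\ V \rrbracket \;=\; \texttt{while}\,(\llbracket P \rrbracket)\,\{\llbracket S \rrbracket\}$$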



The interesting point is that this semantics forgets I (the
loop invariant) and V (the loop variant), which are purely logical
contents, important for the proofs (e.g. of termination) but
irrelevant for the execution.

Modifying the invariant does not change the program (the
operational semantics) and should therefore only have limited
impact on the logical semantics. The surprise is that by
replacing in the previous example the invariant x ∈ [0, min(S)]
by x ∈ ℕ, less precise but still correct, the logical semantics
is radically modified. This modified logical semantics leads to
a refutation of the previous proposition, that is it indicates
that the substitution is not always extracting the minimum.
A rather strange conclusion, as both versions of the logical
semantics describe the same program.
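One way to see where the two invariants differ is to look at what can be deduced when the loop exits. The sketch below assumes the usual loop proof rule, in which the postcondition must follow from the invariant together with the negated guard; this rule is an assumption here, since the exact definition of the WHILE substitution is not reproduced in the text. The function exit_counterexamples is a hypothetical helper that searches a bounded range of values.

```python
def exit_counterexamples(s, invariant, bound):
    """List the x < bound with invariant(x) and x in S but x != min(S).

    An empty list means that, on the explored range, the invariant and
    the negated guard (x in S) are enough to conclude x = min(S).
    """
    m = min(s)
    return [x for x in range(bound) if invariant(x) and x in s and x != m]


S = {3, 7, 12}
strong = lambda x: 0 <= x <= min(S)  # x in [0, min(S)]
weak = lambda x: x >= 0              # x in N
```

With the precise invariant no counterexample exists, whereas the weakened one admits every element of S other than the minimum: the logical reading no longer excludes exits that the program can never actually take.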

We have also identified a similar concern with Coq. In
this case there is a single language, mixing logical and
computational constructs, with an extraction mechanism that
eliminates the former and derives from the latter a
program in a functional language, e.g. in OCaml.

As already pointed out in Par. IV-A, an inductive definition
such as Inductive E : Set := nxt : E → E lacks an atomic
constructor and is therefore empty. Emptiness is not, by itself,
inconsistent but makes it possible to prove any result of the
form ∀ (e : E), P. Its extraction in OCaml is a straightforward
translation to type E = Nxt of E. The interesting point is
that this OCaml type is not empty, as it contains the value
let rec e = Nxt(e), not valid in Coq but making it possible to use
a program extracted from a fully certified Coq library with
unexpected (and therefore unwanted) behaviours.

It is beyond the scope of this paper to further discuss
these questions, once noted that any such bias is a potential
weakness usable by a malicious developer (or a trap for
an honest but inattentive developer). These remarks are not
intended to criticize the tremendous work represented by the
full development of the theories supporting formal methods.
They however justify the interest in mechanically checking
such theories, continuing the work described e.g. in [32]-[34].

B. About the tools 

Beyond the concerns about the theory, one may also ques- 
tion the validity of the tool implementing a formal method. 
For example a prover can be incomplete (unable to prove 
results valid in the theory) or incorrect (able to prove results 
unprovable in the theory), the latter being more worrying, at 
least from the evaluation and certification perspective, as it 
may lead to an artificial paradox. And indeed such paradoxes 
have been discovered in well established tools. 

Clearly, implementing a formal method is a difficult task,
dealing not only with completeness and correctness, but also with
performance, automation, and ergonomics. In our view, the
(potential) existence of bugs in a tool does not mean that
it should not be used, but that the provided results should
be considered with some care, and possibly verified by other
mechanisms. This is addressed for example by [29], [35].

VI. Stepping Out of the Model 

We have discussed at length some concerns regarding the
formal development of secure systems, through questioning
paradoxes in the theory, bugs in the tools or more simply by
identifying gotchas in the specifications. Let's now assume that
we have been able to produce a consistent specification with
security properties correctly expressed, and a compliant
implementation all of whose proof obligations have been discharged,
using a well-established formal method and a trusted tool -
that is, we finally have a proven security system. That does not
mean however that the system is secure, but that any attack has
to contradict at least one of the hypotheses (a good heuristic
for those willing to attack formally validated systems).

Preconditions, for example, are hypotheses whose violation
can be devastating, as illustrated in Par. IV-C. But one should
take care also to identify all the implicit hypotheses when
developing a system or evaluating its security. Such implicit
hypotheses are not only those that are introduced by the formal
method (cf. Par. IV-A), but also those that are related to the
modelling choices themselves.

A. About Closure 

A frequent implicit hypothesis is related to the use of closure 
proofs. For example, proving a B machine requires proving 
the preservation of its invariant by any of its operations. This 
is justified if there is no other way to influence the system 
state than the provided operations. The extent to which this 
is enforced in the real system has to be carefully analysed. 
Threats considered during security analysis may reflect actions 
that are not in the model (data stored in files by proven 
applications can be modified by other applications, signals 
in electronic circuits can be jammed by fault injection, etc). 
There is no silver bullet to address this problem; current 
approaches include defensive style programming, redundancy, 
and dysfunctional considerations (e.g. by modelling errors 
such as unexpected values or inconsistent states). 
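The closure hypothesis can be illustrated with a very small Python object whose invariant is preserved by its only operation and yet broken by a write that bypasses it. The class Counter and its fields are hypothetical and stand for any proven component whose state is reachable by other means than its operations.

```python
class Counter:
    """Invariant: 0 <= value <= limit, preserved by every provided operation."""

    def __init__(self, limit):
        self.limit = limit
        self.value = 0

    def increment(self):
        # Guarded update: cannot push value beyond limit.
        if self.value < self.limit:
            self.value += 1

    def invariant(self):
        return 0 <= self.value <= self.limit


c = Counter(3)
for _ in range(10):
    c.increment()
held_under_operations = c.invariant()

# An action outside the model: the state is modified without going
# through the operations on which the invariant proof relies.
c.value = 99
held_after_external_write = c.invariant()
```

The proof of invariant preservation remains valid; what fails is the hypothesis that the operations are the only way to change the state.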



B. About Typing 

A second example of implicit hypothesis, much less
obvious, is related to types. An adequate use of types in a
specification (for example modelling IP addresses and ports
as values of abstract sets rather than natural values) ensures
that some forms of error will be automatically detected (such
as using a port where an address is expected). But it is also
important to understand how strong a hypothesis it is, and
how easily it can be violated. Indeed, types are again logical
information that generally has no concrete implementation;
in most programming languages, they just disappear at
compilation. So, while ill-typed operation calls cannot be considered
during formal analysis, they are in some cases executable.

A typical example is provided in [36], describing a flaw in
the PKCS#11 API for cryptographic resources, summarised
here. A central authority (e.g. a bank) distributes
cryptographic resources to customers. Such a resource can perform
cryptographic operations, C ← cipher(M, K) to cipher the
message M with the key numbered K, or M ← uncipher(C, K)
for the inverse operation. The resource never discloses keys
to the customer, but permits exchange of keys with other
resources through export of wrapped (ciphered) keys using
D ← export(K, W) where K is the number of the exported key
and W the number of the wrapping key, and import(D, W, K)
for the inverse operation (that stores internally the unwrapped
key under number K without disclosing it). In a model where
ciphertexts and wrapped keys are of different types, one can
prove that no sequence of calls will disclose a sensitive key.
Unfortunately the implementations of ciphertexts and wrapped
keys are indistinguishable, and stored keys are not tagged with
their role. It is thus possible to disclose a key K with the
(ill-typed) sequence D ← export(K, W); M ← uncipher(D, W).
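The ill-typed sequence can be replayed on a toy model of such a resource. The sketch below is not the PKCS#11 API: Token is a hypothetical class, and the cipher is a plain XOR with the key, a stand-in chosen only to keep the example self-contained and in no way a secure cipher. What matters is that wrapped keys and ciphertexts are both plain byte strings.

```python
class Token:
    """Toy cryptographic resource; keys are referred to by number only."""

    def __init__(self, keys):
        self._keys = dict(keys)  # number -> key bytes, never returned directly

    @staticmethod
    def _xor(data, key):
        # Stand-in for a real cipher (assumption): XOR is its own inverse.
        return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))

    def cipher(self, m, k):
        return self._xor(m, self._keys[k])

    def uncipher(self, c, k):
        return self._xor(c, self._keys[k])

    def export(self, k, w):
        # Wrapped key: key number k ciphered with key number w.
        return self._xor(self._keys[k], self._keys[w])

    def import_(self, d, w, k):
        # Unwrap d with key number w and store the result under number k.
        self._keys[k] = self._xor(d, self._keys[w])


def disclose(token, k, w):
    """Ill-typed sequence: a wrapped key is passed where a ciphertext is expected."""
    d = token.export(k, w)
    return token.uncipher(d, w)
```

Nothing in the byte-level representation prevents the call to uncipher; tagging the exported data or the stored keys with their role is what would allow the resource to reject it.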

This demonstrates that it is important to identify implicit
hypotheses associated with the use of types to detect possible
consequences of type violations, or to maintain type
information in the implementation to prevent such attacks.

VII. Conclusion 

We summarise and discuss difficulties related to the devel- 
opment of secure systems using formal methods, identifying 
- where possible - proposals for improvement. The concerns 
described in this paper were identified during a systematic 
review of the process of formal development, investigating 
possible difficulties. 

A quick read of this paper could seem to imply that the
reputation of formal methods to develop correct systems is
overestimated. This is not our message. We consider that
formal methods are very efficient tools to obtain high levels
of assurance and confidence for the development of systems
in general, and of secure systems in particular.

Yet to fully benefit from such tools, one has to understand
their strengths but also their limitations. Pretending that proven
secure systems are perfectly secure is nothing more than a
renewed version of the first myth about formal methods pointed
out in [37], and is at the very least inadequate; in fact, we consider
that such a claim is detrimental to formal methods. Taking
this into account, we expect our proposals to help, where
possible, for improving the quality of formal specifications and
the adequacy of formal developments of secure systems (in
some cases relying on other methodologies or technologies);
our second expectation is to shed some light on the difficulties
to at least allow for a better evaluation of the genuine level of
confidence obtained through the use of formal methods.

Note: An extended version of this paper is available in French
at [38].